Mining the Web for Discourse Markers

Author

  • Ben Hutchinson
Abstract

This paper proposes a methodology for obtaining sentences containing discourse markers from the World Wide Web. The methodology is particularly suitable for collecting large numbers of discourse marker tokens. It relies on the automatic identification of discourse markers, and we show that this can be done with an accuracy within 9% of human performance. We also show that the distribution of discourse markers on the web correlates highly with that in a conventional balanced corpus.

Similar resources

Mining Discourse Markers For Chinese Textual Summarization

Discourse markers foreshadow the message thrust of texts and saliently guide their rhetorical structure, both of which are important for content filtering and text abstraction. This paper reports on efforts to automatically identify and classify discourse markers in Chinese texts using heuristic-based and corpus-based data-mining methods, as an integral part of automatic text summarization via rhetorica...

How Does Explicit and Implicit Instruction of Formal Meta-discourse Markers Affect Learners’ Oral Proficiency?

Meta-discourse markers are an inevitable part of oral proficiency, improving both the quality and comprehension of learners’ speech. While studies of oral meta-discourse have been conducted since the 1980s in a European or US context, the topic has remained relatively unexplored in Iran. Therefore, this study aimed to examine the impact of both explicit and implicit teaching of formal meta-discourse...

High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences

High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It involves extracting patterns that are highly accessed in mobile web service sequences. Unlike the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...

How does Explicit and Implicit Instruction of Formal Meta-discourse Markers Affect Learners’ Writing Skills?

Discourse markers improve both the quality and comprehension of a written text. This study aimed at investigating the effect of explicit and implicit instruction of formal meta-discourse markers on writing skills. The quantitative data were collected from 90 upper-intermediate students at Shiraz University Language Center. Two experimental groups went through an instruction, while the contr...

Expert Discovery: A web mining approach

Expert discovery is the search for an answer to the question: “Who is the best expert of a specific subject in a particular domain within a peculiar array of parameters?” An expert with domain knowledge in any field is crucial for consulting in industry, academia, and the scientific community. The aim of this study is to address the issues of the expert-finding task in a real-world community. Collabor...

Publication date: 2004